————————————————————————————————————————–
The data set we are exploring focuses on a binary outcome: whether the crime rate for a particular neighborhood is above (1) or below (0) the median. There are 466 rows of data, each representing a neighborhood in the Boston metropolitan area. For each neighborhood we are provided with 13 attributes that could potentially be used as predictor variables and one response variable (“target”) that indicates whether or not the neighborhood has an above-median crime rate. Our assignment is to predict whether a neighborhood is likely to have an above-median crime rate based on the 13 attributes provided in the data set. A summary table for the data set is provided below.
————————————————————————————————————————–
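Numeric variables with missing values (column index in parentheses, followed by the number of missing values): LotFrontage(4) 259, GarageYrBlt(60) 81, MasVnrArea(27) 8
Categorical variables with missing values (column index in parentheses, followed by the number of non-missing values): Electrical(43) 1459, MasVnrType(26) 1452, BsmtQual(31) 1423, BsmtCond(32) 1423, BsmtFinType1(34) 1423, BsmtExposure(33) 1422, BsmtFinType2(36) 1422, GarageType(59) 1379, GarageFinish(61) 1379, GarageQual(64) 1379, GarageCond(65) 1379, FireplaceQu(58) 770, Fence(74) 281, Alley(7) 91, MiscFeature(75) 54, PoolQC(73) 7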
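Counts like those above can be produced with a few lines of base R. The sketch below is illustrative only: the data frame `homes` is a small made-up stand-in rather than the actual data set, and the helper name `missing_summary` is hypothetical.

```r
# Toy stand-in for the real data; all values are invented for illustration.
homes <- data.frame(
  LotFrontage = c(65, NA, 80, NA, 70),
  PoolQC      = c(NA, NA, "Gd", NA, NA),
  LotArea     = c(8450, 9600, 11250, 9550, 14260),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for every column with at least one NA, report its
# column index, the number of missing values, and the number present.
missing_summary <- function(df) {
  n_missing <- colSums(is.na(df))
  keep <- n_missing > 0
  data.frame(
    variable  = names(df)[keep],
    index     = unname(which(keep)),
    n_missing = unname(n_missing[keep]),
    n_present = nrow(df) - unname(n_missing[keep]),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

missing_summary(homes)
```

Reporting both `n_missing` and `n_present` avoids ambiguity about which of the two a given count refers to.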
| vars | id | n | mean | sd | median | min | max | range | skew | kurtosis | se |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MSSubClass | 2 | 1460 | 57 | 42 | 50 | 20 | 190 | 170 | 1 | 2 | 1 |
| LotFrontage | 4 | 1201 | 70 | 24 | 69 | 21 | 313 | 292 | 2 | 17 | 1 |
| LotArea | 5 | 1460 | 10517 | 9981 | 9478 | 1300 | 215245 | 213945 | 12 | 202 | 261 |
| OverallQual | 18 | 1460 | 6 | 1 | 6 | 1 | 10 | 9 | 0 | 0 | 0 |
| OverallCond | 19 | 1460 | 6 | 1 | 5 | 1 | 9 | 8 | 1 | 1 | 0 |
| YearBuilt | 20 | 1460 | 1971 | 30 | 1973 | 1872 | 2010 | 138 | -1 | 0 | 1 |
| YearRemodAdd | 21 | 1460 | 1985 | 21 | 1994 | 1950 | 2010 | 60 | -1 | -1 | 1 |
| MasVnrArea | 27 | 1452 | 104 | 181 | 0 | 0 | 1600 | 1600 | 3 | 10 | 5 |
| BsmtFinSF1 | 35 | 1460 | 444 | 456 | 384 | 0 | 5644 | 5644 | 2 | 11 | 12 |
| BsmtFinSF2 | 37 | 1460 | 47 | 161 | 0 | 0 | 1474 | 1474 | 4 | 20 | 4 |
| BsmtUnfSF | 38 | 1460 | 567 | 442 | 478 | 0 | 2336 | 2336 | 1 | 0 | 12 |
| TotalBsmtSF | 39 | 1460 | 1057 | 439 | 992 | 0 | 6110 | 6110 | 2 | 13 | 11 |
| X1stFlrSF | 44 | 1460 | 1163 | 387 | 1087 | 334 | 4692 | 4358 | 1 | 6 | 10 |
| X2ndFlrSF | 45 | 1460 | 347 | 437 | 0 | 0 | 2065 | 2065 | 1 | -1 | 11 |
| LowQualFinSF | 46 | 1460 | 6 | 49 | 0 | 0 | 572 | 572 | 9 | 83 | 1 |
| GrLivArea | 47 | 1460 | 1515 | 525 | 1464 | 334 | 5642 | 5308 | 1 | 5 | 14 |
| BsmtFullBath | 48 | 1460 | 0 | 1 | 0 | 0 | 3 | 3 | 1 | -1 | 0 |
| BsmtHalfBath | 49 | 1460 | 0 | 0 | 0 | 0 | 2 | 2 | 4 | 16 | 0 |
| FullBath | 50 | 1460 | 2 | 1 | 2 | 0 | 3 | 3 | 0 | -1 | 0 |
| HalfBath | 51 | 1460 | 0 | 1 | 0 | 0 | 2 | 2 | 1 | -1 | 0 |
| BedroomAbvGr | 52 | 1460 | 3 | 1 | 3 | 0 | 8 | 8 | 0 | 2 | 0 |
| KitchenAbvGr | 53 | 1460 | 1 | 0 | 1 | 0 | 3 | 3 | 4 | 21 | 0 |
| TotRmsAbvGrd | 55 | 1460 | 7 | 2 | 6 | 2 | 14 | 12 | 1 | 1 | 0 |
| Fireplaces | 57 | 1460 | 1 | 1 | 1 | 0 | 3 | 3 | 1 | 0 | 0 |
| GarageYrBlt | 60 | 1379 | 1979 | 25 | 1980 | 1900 | 2010 | 110 | -1 | 0 | 1 |
| GarageCars | 62 | 1460 | 2 | 1 | 2 | 0 | 4 | 4 | 0 | 0 | 0 |
| GarageArea | 63 | 1460 | 473 | 214 | 480 | 0 | 1418 | 1418 | 0 | 1 | 6 |
| WoodDeckSF | 67 | 1460 | 94 | 125 | 0 | 0 | 857 | 857 | 2 | 3 | 3 |
| OpenPorchSF | 68 | 1460 | 47 | 66 | 25 | 0 | 547 | 547 | 2 | 8 | 2 |
| EnclosedPorch | 69 | 1460 | 22 | 61 | 0 | 0 | 552 | 552 | 3 | 10 | 2 |
| X3SsnPorch | 70 | 1460 | 3 | 29 | 0 | 0 | 508 | 508 | 10 | 123 | 1 |
| ScreenPorch | 71 | 1460 | 15 | 56 | 0 | 0 | 480 | 480 | 4 | 18 | 1 |
| PoolArea | 72 | 1460 | 3 | 40 | 0 | 0 | 738 | 738 | 15 | 222 | 1 |
| MiscVal | 76 | 1460 | 43 | 496 | 0 | 0 | 15500 | 15500 | 24 | 698 | 13 |
| MoSold | 77 | 1460 | 6 | 3 | 6 | 1 | 12 | 11 | 0 | 0 | 0 |
| YrSold | 78 | 1460 | 2008 | 1 | 2008 | 2006 | 2010 | 4 | 0 | -1 | 0 |
| SalePrice | 81 | 1460 | 180921 | 79443 | 163000 | 34900 | 755000 | 720100 | 2 | 6 | 2079 |
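
The column layout of this table resembles the output of `describe()` from the psych package, although the source does not say which function produced it. As a rough base-R sketch, the simpler columns can be computed as follows; the helper `describe_basic` and the example vector are hypothetical.

```r
# Hypothetical helper: basic summary statistics for one numeric vector.
describe_basic <- function(x) {
  x <- x[!is.na(x)]               # statistics use non-missing values only
  n <- length(x)
  c(
    n      = n,
    mean   = mean(x),
    sd     = sd(x),
    median = median(x),
    min    = min(x),
    max    = max(x),
    range  = max(x) - min(x),
    se     = sd(x) / sqrt(n)      # standard error of the mean
  )
}

# Invented values for illustration only.
lot_frontage <- c(65, 80, NA, 68, 60, 84)
round(describe_basic(lot_frontage), 2)
```

The `se` column in the table is consistent with this definition: for LotArea, 9981 / sqrt(1460) is approximately 261.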
During our data exploration efforts we identified variable(s) that can justifiably be ignored during model building, and we ranked the correlations among the predictor variables, a ranking that could be used to decide the order in which variables are included during the model building process. Furthermore, we identified recurring variable values, which require further investigation to determine whether additional action is needed before those rows can be used in our analysis. Finally, we identified three variables (. . . ) that are good candidates for transformation during the Data Preparation process.
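The source does not show how the correlation ranking was computed. One plausible reading is a ranking of numeric predictors by the absolute value of their Pearson correlation with the response, sketched below. The helper `rank_by_correlation` and the simulated data frame `toy` are assumptions made for illustration.

```r
# Hypothetical helper: rank numeric predictors by the absolute value of
# their Pearson correlation with the response column.
rank_by_correlation <- function(df, response) {
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  predictors <- setdiff(numeric_cols, response)
  cors <- vapply(
    predictors,
    function(p) cor(df[[p]], df[[response]], use = "complete.obs"),
    numeric(1)
  )
  cors[order(abs(cors), decreasing = TRUE)]
}

# Simulated data: y depends strongly on x1, weakly on x2, not at all on x3.
set.seed(1)
toy <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
toy$y <- 3 * toy$x1 - 0.5 * toy$x2 + rnorm(50, sd = 0.5)

rank_by_correlation(toy, "y")
```

Sorting on the absolute value keeps strongly negative correlations near the top of the list, which matters when the ranking drives the order of inclusion.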
————————————————————————————————————————–
Our Data Preparation efforts included investigating what appeared to be an abnormally large number of records sharing the same values, proposing the conversion of two predictor variables to binary “0/1” factor variables, and simplifying the interpretation of the ‘black’ variable via a mathematical transformation. While we also considered transforming one or more of the predictor variables that have skewed distributions, we chose not to apply any such transforms prior to model building, since logistic regression does not require normally distributed predictors. Transforms can be applied later if the marginal model plots for a logistic regression model show evidence of deviation between the modeled data and the actual data.
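As a minimal sketch of the preparation steps described above, the code below converts a predictor to a binary 0/1 factor, leaves a skewed predictor untransformed, and fits a logistic regression with `glm()`. The data are simulated and the variable names (`x_skewed`, `x_count`, `x_binary`) are hypothetical; the source does not name the two variables that were converted.

```r
# Simulated data for illustration; not the data set described in the text.
set.seed(42)
n <- 200
toy <- data.frame(
  x_skewed = rexp(n),               # right-skewed predictor, left as-is
  x_count  = rpois(n, lambda = 1)   # predictor to collapse into 0/1
)

# Hypothetical conversion rule: 1 if the count is positive, otherwise 0.
toy$x_binary <- factor(as.integer(toy$x_count > 0), levels = c(0, 1))

# Simulated 0/1 response that depends on both predictors.
toy$target <- rbinom(
  n, size = 1,
  prob = plogis(-1 + 1.2 * toy$x_skewed + 0.8 * (toy$x_count > 0))
)

# Logistic regression on the untransformed skewed predictor and the factor.
fit_logit <- glm(target ~ x_skewed + x_binary, data = toy, family = binomial)
summary(fit_logit)$coefficients
```

Marginal model plots for a fitted model of this kind are available, for example, through `marginalModelPlots()` in the car package; a visible gap between the data curve and the model curve for a predictor would be the cue to revisit a transformation.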
————————————————————————————————————————–
# Final Model:
fit <- lm(
  SalePrice ~ MSSubClass + MSZoning + LotArea + Street + LandContour +
    Utilities + LotConfig + LandSlope + Neighborhood + Condition1 +
    Condition2 + BldgType + OverallQual + OverallCond + YearBuilt +
    YearRemodAdd + RoofStyle + RoofMatl + Exterior1st + MasVnrType +
    MasVnrArea + ExterQual + BsmtQual + BsmtCond + BsmtExposure +
    BsmtFinSF1 + BsmtFinSF2 + BsmtUnfSF + X1stFlrSF + X2ndFlrSF +
    BsmtFullBath + FullBath + BedroomAbvGr + KitchenAbvGr + KitchenQual +
    TotRmsAbvGrd + Functional + Fireplaces + GarageCars + GarageArea +
    GarageQual + GarageCond + WoodDeckSF + ScreenPorch + PoolArea +
    PoolQC + Fence + MoSold + SaleCondition,
  data = AmesHomes
)
summary(fit)
##
## Call:
## lm(formula = SalePrice ~ MSSubClass + MSZoning + LotArea + Street +
## LandContour + Utilities + LotConfig + LandSlope + Neighborhood +
## Condition1 + Condition2 + BldgType + OverallQual + OverallCond +
## YearBuilt + YearRemodAdd + RoofStyle + RoofMatl + Exterior1st +
## MasVnrType + MasVnrArea + ExterQual + BsmtQual + BsmtCond +
## BsmtExposure + BsmtFinSF1 + BsmtFinSF2 + BsmtUnfSF + X1stFlrSF +
## X2ndFlrSF + BsmtFullBath + FullBath + BedroomAbvGr + KitchenAbvGr +
## KitchenQual + TotRmsAbvGrd + Functional + Fireplaces + GarageCars +
## GarageArea + GarageQual + GarageCond + WoodDeckSF + ScreenPorch +
## PoolArea + PoolQC + Fence + MoSold + SaleCondition, data = AmesHomes)
##
## Residuals:
## Min 1Q Median 3Q Max
## -178640 -9148 0 9867 178640
##
## Coefficients: (2 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.748e+06 1.679e+05 -10.414 < 2e-16 ***
## MSSubClass -9.636e+01 4.576e+01 -2.106 0.035411 *
## MSZoningFV 3.238e+04 1.121e+04 2.889 0.003924 **
## MSZoningRH 1.948e+04 1.117e+04 1.744 0.081424 .
## MSZoningRL 2.377e+04 9.513e+03 2.499 0.012577 *
## MSZoningRM 1.959e+04 8.869e+03 2.208 0.027386 *
## LotArea 6.962e-01 9.487e-02 7.338 3.80e-13 ***
## StreetPave 3.361e+04 1.122e+04 2.995 0.002795 **
## LandContourHLS 9.211e+03 4.836e+03 1.905 0.057036 .
## LandContourLow -8.791e+03 5.928e+03 -1.483 0.138344
## LandContourLvl 6.372e+03 3.449e+03 1.847 0.064945 .
## UtilitiesNoSeWa -3.255e+04 2.384e+04 -1.365 0.172392
## LotConfigCulDSac 7.192e+03 2.986e+03 2.408 0.016165 *
## LotConfigFR2 -6.352e+03 3.805e+03 -1.669 0.095280 .
## LotConfigFR3 -1.326e+04 1.224e+04 -1.083 0.279024
## LotConfigInside -1.294e+03 1.657e+03 -0.781 0.434873
## LandSlopeMod 5.623e+03 3.704e+03 1.518 0.129277
## LandSlopeSev -3.928e+04 1.034e+04 -3.797 0.000153 ***
## NeighborhoodBlueste 2.440e+03 1.826e+04 0.134 0.893734
## NeighborhoodBrDale -1.694e+03 1.030e+04 -0.164 0.869459
## NeighborhoodBrkSide -2.071e+03 8.794e+03 -0.235 0.813905
## NeighborhoodClearCr -1.158e+04 8.736e+03 -1.325 0.185417
## NeighborhoodCollgCr -8.600e+03 6.876e+03 -1.251 0.211205
## NeighborhoodCrawfor 1.189e+04 8.091e+03 1.469 0.142070
## NeighborhoodEdwards -1.811e+04 7.587e+03 -2.387 0.017115 *
## NeighborhoodGilbert -1.182e+04 7.264e+03 -1.627 0.103985
## NeighborhoodIDOTRR -6.420e+03 1.001e+04 -0.641 0.521357
## NeighborhoodMeadowV -8.398e+03 1.051e+04 -0.799 0.424203
## NeighborhoodMitchel -2.165e+04 7.745e+03 -2.795 0.005258 **
## NeighborhoodNAmes -1.614e+04 7.408e+03 -2.178 0.029555 *
## NeighborhoodNoRidge 2.874e+04 7.994e+03 3.595 0.000337 ***
## NeighborhoodNPkVill 6.081e+03 1.043e+04 0.583 0.559994
## NeighborhoodNridgHt 1.642e+04 7.062e+03 2.325 0.020233 *
## NeighborhoodNWAmes -1.892e+04 7.587e+03 -2.494 0.012762 *
## NeighborhoodOldTown -1.149e+04 9.040e+03 -1.271 0.203923
## NeighborhoodSawyer -1.080e+04 7.753e+03 -1.393 0.163995
## NeighborhoodSawyerW -2.851e+03 7.429e+03 -0.384 0.701207
## NeighborhoodSomerst -2.016e+03 8.567e+03 -0.235 0.814039
## NeighborhoodStoneBr 3.561e+04 7.880e+03 4.520 6.75e-06 ***
## NeighborhoodSWISU -5.781e+03 9.040e+03 -0.640 0.522592
## NeighborhoodTimber -1.241e+04 7.743e+03 -1.603 0.109216
## NeighborhoodVeenker -9.747e+01 9.859e+03 -0.010 0.992113
## Condition1Feedr 7.073e+03 4.681e+03 1.511 0.131023
## Condition1Norm 1.508e+04 3.884e+03 3.883 0.000108 ***
## Condition1PosA 7.414e+03 9.465e+03 0.783 0.433594
## Condition1PosN 1.230e+04 6.964e+03 1.766 0.077692 .
## Condition1RRAe -1.210e+04 8.246e+03 -1.468 0.142452
## Condition1RRAn 1.370e+04 6.476e+03 2.115 0.034626 *
## Condition1RRNe -5.063e+01 1.691e+04 -0.003 0.997611
## Condition1RRNn 1.004e+04 1.198e+04 0.838 0.402226
## Condition2Feedr -1.082e+04 2.130e+04 -0.508 0.611621
## Condition2Norm -1.207e+04 1.832e+04 -0.659 0.510192
## Condition2PosA 3.890e+04 2.999e+04 1.297 0.194894
## Condition2PosN -2.384e+05 2.584e+04 -9.225 < 2e-16 ***
## Condition2RRAe -1.151e+05 4.156e+04 -2.770 0.005688 **
## Condition2RRAn -1.572e+04 2.951e+04 -0.533 0.594249
## Condition2RRNn -1.077e+04 2.502e+04 -0.431 0.666849
## BldgType2fmCon 3.730e+03 8.287e+03 0.450 0.652741
## BldgTypeDuplex -3.395e+03 6.223e+03 -0.546 0.585444
## BldgTypeTwnhs -1.528e+04 6.948e+03 -2.199 0.028019 *
## BldgTypeTwnhsE -1.091e+04 5.584e+03 -1.953 0.051002 .
## OverallQual 6.683e+03 9.375e+02 7.129 1.67e-12 ***
## OverallCond 5.571e+03 7.670e+02 7.264 6.45e-13 ***
## YearBuilt 3.720e+02 6.099e+01 6.099 1.40e-09 ***
## YearRemodAdd 1.098e+02 5.058e+01 2.171 0.030088 *
## RoofStyleGable 6.925e+03 1.740e+04 0.398 0.690721
## RoofStyleGambrel 1.098e+04 1.889e+04 0.581 0.561127
## RoofStyleHip 7.291e+03 1.746e+04 0.418 0.676322
## RoofStyleMansard 2.177e+04 1.992e+04 1.093 0.274588
## RoofStyleShed 9.154e+04 3.287e+04 2.785 0.005435 **
## RoofMatlCompShg 5.785e+05 4.330e+04 13.361 < 2e-16 ***
## RoofMatlMembran 6.664e+05 5.382e+04 12.383 < 2e-16 ***
## RoofMatlMetal 6.395e+05 5.361e+04 11.929 < 2e-16 ***
## RoofMatlRoll 5.701e+05 4.958e+04 11.499 < 2e-16 ***
## RoofMatlTar&Grv 5.772e+05 4.705e+04 12.268 < 2e-16 ***
## RoofMatlWdShake 5.700e+05 4.575e+04 12.458 < 2e-16 ***
## RoofMatlWdShngl 6.321e+05 4.404e+04 14.353 < 2e-16 ***
## Exterior1stAsphShn -1.119e+04 2.396e+04 -0.467 0.640496
## Exterior1stBrkComm 6.354e+02 1.854e+04 0.034 0.972668
## Exterior1stBrkFace 1.665e+04 6.694e+03 2.488 0.012969 *
## Exterior1stCBlock -1.202e+04 2.531e+04 -0.475 0.634977
## Exterior1stCemntBd 2.371e+03 7.066e+03 0.336 0.737297
## Exterior1stHdBoard -3.807e+03 6.055e+03 -0.629 0.529604
## Exterior1stImStucc -8.253e+03 2.406e+04 -0.343 0.731673
## Exterior1stMetalSd 1.080e+03 5.933e+03 0.182 0.855546
## Exterior1stPlywood -6.775e+03 6.386e+03 -1.061 0.288875
## Exterior1stStone -6.108e+03 1.912e+04 -0.320 0.749359
## Exterior1stStucco 9.139e+02 7.439e+03 0.123 0.902244
## Exterior1stVinylSd -2.243e+02 5.970e+03 -0.038 0.970036
## Exterior1stWd Sdng -9.220e+02 5.897e+03 -0.156 0.875778
## Exterior1stWdShing -3.675e+03 7.366e+03 -0.499 0.617975
## MasVnrTypeBrkFace 6.694e+03 6.499e+03 1.030 0.303229
## MasVnrTypeNone 1.009e+04 6.560e+03 1.538 0.124189
## MasVnrTypeStone 1.084e+04 6.882e+03 1.575 0.115573
## MasVnrArea 2.194e+01 5.569e+00 3.940 8.59e-05 ***
## ExterQualFa -7.162e+03 9.706e+03 -0.738 0.460712
## ExterQualGd -2.249e+04 4.602e+03 -4.887 1.15e-06 ***
## ExterQualTA -2.231e+04 5.071e+03 -4.399 1.18e-05 ***
## BsmtQualFa -1.329e+04 5.904e+03 -2.251 0.024577 *
## BsmtQualGd -2.078e+04 3.141e+03 -6.615 5.39e-11 ***
## BsmtQualNone 1.063e+04 2.401e+04 0.443 0.658010
## BsmtQualTA -1.798e+04 3.834e+03 -4.691 3.00e-06 ***
## BsmtCondGd 1.074e+03 4.981e+03 0.216 0.829315
## BsmtCondPo 4.130e+04 2.090e+04 1.976 0.048396 *
## BsmtCondTA 4.082e+03 3.933e+03 1.038 0.299515
## BsmtCondXa NA NA NA NA
## BsmtExposureGd 1.578e+04 2.855e+03 5.528 3.90e-08 ***
## BsmtExposureMn -3.102e+03 2.861e+03 -1.084 0.278481
## BsmtExposureNo -6.046e+03 2.026e+03 -2.985 0.002892 **
## BsmtExposureXb -1.563e+04 2.267e+04 -0.690 0.490539
## BsmtFinSF1 3.457e+01 4.483e+00 7.711 2.47e-14 ***
## BsmtFinSF2 2.479e+01 5.639e+00 4.395 1.20e-05 ***
## BsmtUnfSF 1.842e+01 4.241e+00 4.344 1.51e-05 ***
## X1stFlrSF 4.797e+01 4.932e+00 9.727 < 2e-16 ***
## X2ndFlrSF 5.503e+01 3.465e+00 15.884 < 2e-16 ***
## BsmtFullBath 2.694e+03 1.732e+03 1.556 0.120056
## FullBath 2.810e+03 1.905e+03 1.476 0.140309
## BedroomAbvGr -4.667e+03 1.263e+03 -3.694 0.000230 ***
## KitchenAbvGr -1.538e+04 5.207e+03 -2.953 0.003200 **
## KitchenQualFa -1.866e+04 5.693e+03 -3.277 0.001077 **
## KitchenQualGd -2.305e+04 3.312e+03 -6.960 5.37e-12 ***
## KitchenQualTA -2.304e+04 3.732e+03 -6.175 8.84e-10 ***
## TotRmsAbvGrd 2.107e+03 8.918e+02 2.362 0.018304 *
## FunctionalMaj2 -6.493e+03 1.284e+04 -0.506 0.613186
## FunctionalMin1 5.659e+03 7.952e+03 0.712 0.476807
## FunctionalMin2 8.483e+03 7.856e+03 1.080 0.280420
## FunctionalMod 1.292e+03 9.293e+03 0.139 0.889463
## FunctionalSev -3.609e+04 2.651e+04 -1.361 0.173620
## FunctionalTyp 1.867e+04 6.848e+03 2.727 0.006484 **
## Fireplaces 2.626e+03 1.258e+03 2.087 0.037112 *
## GarageCars 4.671e+03 2.148e+03 2.175 0.029807 *
## GarageArea 1.414e+01 7.148e+00 1.978 0.048179 *
## GarageQualFa -1.233e+05 2.677e+04 -4.605 4.53e-06 ***
## GarageQualGd -1.171e+05 2.752e+04 -4.254 2.25e-05 ***
## GarageQualPo -1.374e+05 3.318e+04 -4.140 3.70e-05 ***
## GarageQualTA -1.196e+05 2.655e+04 -4.506 7.20e-06 ***
## GarageQualXg -1.372e+03 1.678e+04 -0.082 0.934851
## GarageCondFa 1.078e+05 3.150e+04 3.421 0.000642 ***
## GarageCondGd 1.020e+05 3.251e+04 3.138 0.001737 **
## GarageCondPo 1.109e+05 3.381e+04 3.279 0.001070 **
## GarageCondTA 1.103e+05 3.119e+04 3.535 0.000421 ***
## GarageCondXh NA NA NA NA
## WoodDeckSF 1.071e+01 5.545e+00 1.932 0.053597 .
## ScreenPorch 3.077e+01 1.188e+01 2.591 0.009683 **
## PoolArea 5.611e+02 1.646e+02 3.408 0.000673 ***
## PoolQCFa -1.451e+05 2.558e+04 -5.672 1.74e-08 ***
## PoolQCGd -1.161e+05 3.074e+04 -3.776 0.000166 ***
## PoolQCNone 1.816e+05 8.953e+04 2.029 0.042684 *
## FenceGdWo 7.657e+03 4.631e+03 1.653 0.098507 .
## FenceMnPrv 9.441e+03 3.779e+03 2.498 0.012602 *
## FenceMnWw 1.081e+03 7.800e+03 0.139 0.889780
## FenceNone 8.136e+03 3.465e+03 2.348 0.019007 *
## MoSold -3.489e+02 2.318e+02 -1.505 0.132534
## SaleConditionAdjLand 9.769e+03 1.271e+04 0.768 0.442358
## SaleConditionAlloca -1.841e+03 8.139e+03 -0.226 0.821075
## SaleConditionFamily 1.103e+03 5.763e+03 0.191 0.848277
## SaleConditionNormal 5.739e+03 2.593e+03 2.213 0.027040 *
## SaleConditionPartial 1.910e+04 3.638e+03 5.250 1.77e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 22440 on 1304 degrees of freedom
## Multiple R-squared: 0.9287, Adjusted R-squared: 0.9202
## F-statistic: 109.6 on 155 and 1304 DF, p-value: < 2.2e-16
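The note "(2 not defined because of singularities)" and the `NA` rows for `BsmtCondXa` and `GarageCondXh` indicate that those dummy columns are exact linear combinations of other columns in the design matrix. One plausible cause, which the source does not confirm, is that the "no basement" or "no garage" level marks exactly the same rows in two different factors. The toy example below reproduces that situation; the data and the names `qual`, `cond`, and `price` are invented.

```r
# Toy data: rows without a basement get the level "None" in BOTH factors,
# so the two "None" dummy columns are identical.
set.seed(7)
n <- 60
has_bsmt <- rep(c(TRUE, TRUE, FALSE), length.out = n)
toy <- data.frame(
  qual = ifelse(has_bsmt, sample(c("Gd", "TA"), n, replace = TRUE), "None"),
  cond = ifelse(has_bsmt, sample(c("Gd", "TA"), n, replace = TRUE), "None"),
  stringsAsFactors = TRUE
)
toy$price <- 100 + 10 * (toy$qual == "TA") + rnorm(n)

fit_toy <- lm(price ~ qual + cond, data = toy)

coef(fit_toy)    # condNone is NA because it duplicates qualNone
alias(fit_toy)   # reports the exact linear dependency
```

`lm()` still fits the model in this case; it drops the redundant column and reports `NA` for its coefficient, so the remaining estimates are unaffected by the duplicate.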
————————————————————————————————————————–
[Figure: Zoning Proportions]
[Figure: Exterior1st Proportions; 1 row containing a non-finite value was removed (stat_count)]